Add target_feature support for compute_* #239

Merged
LegNeato merged 1 commit into Rust-GPU:main from LegNeato:target_feature
Jul 28, 2025
@LegNeato LegNeato commented Jul 28, 2025

This lets us gate code on virtual architectures at compile time using `cfg()`. `target_arch` is already taken by `nvptx64`, so we can't use that, and arguably this matches rust-gpu's use of `target_feature` more closely.
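
As a rough illustration of the gating described above, here is a minimal sketch. It assumes the feature names follow the `compute_*` pattern from the PR title, so `compute_70` is an assumed example value, and `selected_path` is a hypothetical helper rather than anything in the crate.

```rust
// Sketch only: `compute_70` is an assumed feature name based on the
// `compute_*` pattern in the PR title; `selected_path` is hypothetical.

/// Reports which code path was selected at compile time.
/// On a host build no `compute_*` feature is enabled, so the
/// fallback branch is the one that gets compiled in.
#[allow(unexpected_cfgs)] // rustc does not know `compute_*` on host targets
fn selected_path() -> &'static str {
    if cfg!(target_feature = "compute_70") {
        // Code that relies on the assumed `compute_70` virtual architecture.
        "compute_70 path"
    } else {
        // Portable fallback for anything else.
        "fallback path"
    }
}
```

The same condition can also be written as an attribute, `#[cfg(target_feature = "compute_70")]`, on an item when whole functions need to be included or excluded.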

@LegNeato LegNeato merged commit 5a70839 into Rust-GPU:main Jul 28, 2025
7 checks passed
@LegNeato LegNeato deleted the target_feature branch July 28, 2025 00:45
nnethercote added a commit to nnethercote/Rust-CUDA that referenced this pull request Nov 25, 2025
CUDA C++ has the `__CUDA_ARCH__` macro for conditional compilation.
rust-cuda has a `CUDA_ARCH` environment variable that is similar, and
the `from_cuda_arch_env` method parses the environment variable's value
to produce a value of type `ComputeCapability`, which can be queried for
conditional compilation.

But `ComputeCapability` has a big problem. It's missing all the
capabilities after 80, including the 'a' and 'f' suffix ones. We could
just add them, but it implements `PartialOrd`/`Ord` and uses ordering to
determine feature availability. This was valid before the 'a' and 'f'
suffixes were added but is no longer, because some pairs of values are
incomparable. E.g. `100a` and `101a` -- each one has some features the
other doesn't, so neither is clearly larger than the other, and they're
also not equal.
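
The incomparability argument can be sketched by modelling an architecture as the set of features it supports. This is a hypothetical model, not the removed `ComputeCapability` type, and the feature names in the usage note are made up.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;

/// Hypothetical model: an architecture is just the set of features it has.
#[derive(Debug, PartialEq, Eq)]
struct Arch(BTreeSet<&'static str>);

impl PartialOrd for Arch {
    /// One architecture is "less" than another only if its features are a
    /// strict subset. If each side has a feature the other lacks, there is
    /// no meaningful ordering, so the result is `None`.
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        if self.0 == other.0 {
            Some(Ordering::Equal)
        } else if self.0.is_subset(&other.0) {
            Some(Ordering::Less)
        } else if self.0.is_superset(&other.0) {
            Some(Ordering::Greater)
        } else {
            None
        }
    }
}

/// Convenience constructor for the sketch.
fn arch(features: &[&'static str]) -> Arch {
    Arch(features.iter().copied().collect())
}
```

With `arch(&["base", "feature_x"])` and `arch(&["base", "feature_y"])`, `partial_cmp` returns `None`, so a total `Ord` implementation cannot describe the relationship, whereas a per-feature `cfg` check asks only about the feature in question.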

So, what to do? Well, `CUDA_ARCH` was added in 2022. More recently,
another mechanism for conditional compilation was added:
`target_feature`, in Rust-GPU#239. This does work with the 'a' and 'f' suffix
targets, and it's more Rust-y.

So this commit just removes `CUDA_ARCH` and `ComputeCapability`
(removing two more places where the default compilation target is
specified) and changes the only uses (in `cuda_std/src/atomic/mid.rs`)
to use `target_feature` instead. We don't have any tests exercising
conditional compilation, alas, but I did some manual checking locally to
verify that it works the same.
nnethercote added a commit to nnethercote/Rust-CUDA that referenced this pull request Nov 26, 2025
LegNeato pushed a commit that referenced this pull request Nov 28, 2025